Glossary
1 Page Visit - A visit that requests only a single non-graphic, but which may make any number of requests for graphics.
Agent - There is always some piece of software acting as the "agent" for the user making the request. There is a standard way for that software to tell the web server its name, version number, and possibly other information.
The agent information is not always put into the log file. NCSA Combined logs contain it. WebSTAR logs agent information when the "AGENT" or "CS(USER-AGENT)" tokens are included. Microsoft IIS calls this field "User Agent".
Many web browsers "lie" about their identity. Some web browsers can be directly configured by the end user to send whatever string the user wants. Microsoft Internet Explorer will normally "pretend" to be Netscape Navigator, as in "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)", which starts with "Mozilla", the standard agent tag for Netscape Navigator, but then indicates who it really is with the "compatible; MSIE 4.01;" portion. Some browsers just lie outright, claiming to be Netscape Navigator and giving no hint that this isn't true.
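The "compatible" convention described above can be illustrated with a short Python sketch. The function name guess_browser and its two rules are assumptions made for this example only; they are not Summary's actual decoding rules, which are not documented here.

```python
import re

def guess_browser(agent: str) -> str:
    """Guess a browser name from an agent string (illustrative rules only)."""
    # Internet Explorer announces itself inside the parenthesized comment,
    # e.g. "Mozilla/4.0 (compatible; MSIE 4.01; Windows 95)".
    match = re.search(r"compatible;\s*MSIE\s+([\d.]+)", agent)
    if match:
        return "Microsoft Internet Explorer " + match.group(1)
    # Otherwise take a leading "Mozilla/x.y" tag at face value.
    match = re.match(r"Mozilla/([\d.]+)", agent)
    if match:
        return "Netscape Navigator " + match.group(1)
    return "Unknown"
```

A browser that lies outright would still be reported as Netscape Navigator by a sketch like this, since nothing in the string contradicts the claim.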
Auth. User - The server can be configured to require authorization, that is, the entry of a user name and password, to access a page. The Auth. User is simply the string typed into the name field of the authorization dialog. This name is not present for pages which are freely accessible and does not necessarily have anything to do with the actual name of the person making the request. NCSA Common and Combined logs both have the Auth. User name. WebSTAR logs this name when the USER token is in the log format. Microsoft IIS calls this field "User Name".
BPS - Bits per second. This is a rate of data transfer. Common modems are capable of 33.6K or 56K BPS. A T1 line is about 1.5 Meg BPS.
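To give a feel for these rates, the following Python sketch computes the ideal transfer time of a file at a given BPS. The function name transfer_seconds and the 100,000-byte file size are assumptions for illustration; real transfers are slower because of protocol overhead and line conditions.

```python
def transfer_seconds(size_bytes: int, bits_per_second: float) -> float:
    """Ideal time to move size_bytes over a link, ignoring all overhead."""
    return size_bytes * 8 / bits_per_second

# A 100,000-byte file over a 56K modem and over a 1.5 Meg BPS T1 line.
modem = transfer_seconds(100_000, 56_000)     # a little over 14 seconds
t1 = transfer_seconds(100_000, 1_500_000)     # a little over half a second
```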
Browser - The name of the web browser used to make the request. This is derived from the agent string and suffers from some of the same "lying" issues the agent string does. In most cases Summary decodes the standard methods of partially hiding the identity of the browser.
CGI Arguments - A URL can optionally contain a question mark. The portion of the URL to the right of the question mark is often referred to as the query string, search argument, URI query, or CGI argument. This portion is traditionally passed to a CGI program for interpretation.
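As a minimal sketch of the split described above, the following Python function separates a requested URL at the first question mark. The function name split_cgi_arguments and the sample URLs are hypothetical.

```python
from typing import Tuple

def split_cgi_arguments(url: str) -> Tuple[str, str]:
    """Split a URL into (path, CGI arguments) at the first question mark."""
    path, _, arguments = url.partition("?")
    # When there is no question mark, the arguments part is an empty string.
    return path, arguments
```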
Cookie - A web server can send "cookie" information down to a web browser, which will then supply that information back to the server along with each subsequent request. Some end users disable this feature of their web browsers. WebSTAR logs cookie information when the "CS(COOKIE)" token is included.
Curr - Current, the "current" time period. Normally used as a prefix, as in "Curr Hits", which means hits in the current time period. The length of the current time period is configurable; by default it is one week.
Destination - The first non-graphic request following the current one in a visit.
Domain - On the Internet, most computers are given names which can be used to access them. These names are called domain names. Domain names consist of two or more parts separated by periods, for example "summary.net". You can refer to all of the computers that share some right-hand portion of a name as being in the same domain; for example, "www.summary.net" and "mail.summary.net" are both in the "summary.net" domain. In Summary, the domain name is considered to be the rightmost two or three segments of the name. Summary decides when to use two and when to use three segments in an attempt to match the domain to a company or organization: more segments would typically refer to a single computer, fewer to a country.
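The two-or-three-segment choice can be sketched in Python as follows. Summary's real rule is not described in this glossary, so the heuristic here is purely an assumption: three segments are kept when the name ends in a two-letter country code preceded by a category label such as "co". The function name guess_domain and the CATEGORY_LABELS set are hypothetical, and the sketch expects a resolved host name rather than an IP address.

```python
# Hypothetical set of labels that act as categories under two-letter
# country codes (for example "co" in "www.example.co.uk").
CATEGORY_LABELS = {"ac", "co", "com", "edu", "gov", "net", "org"}

def guess_domain(host: str) -> str:
    """Return the rightmost two or three segments of a host name."""
    parts = host.lower().rstrip(".").split(".")
    if len(parts) <= 2:
        return ".".join(parts)
    is_country_code = len(parts[-1]) == 2
    if is_country_code and parts[-2] in CATEGORY_LABELS:
        # Two segments would only name a country-level category here.
        return ".".join(parts[-3:])
    return ".".join(parts[-2:])
```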
Download - A request for a file that is stored or decoded into a file in the user's file system, as opposed to being displayed on the screen as part of a web page. Summary uses the file name extension to determine if a request is a download. The set of extensions used to make this determination can be configured.
DNS Lookup - The process of resolving an IP address (for example 192.168.1.1) to a host name (for example summary.net). Host names are registered in the global Domain Name System. Most web servers can be configured to do DNS lookups on the IP address of incoming requests, but it is more efficient to have the web server not do it. Either the web server or Summary can do the lookups. One of them must do the lookups if you want the top level domain and domain reports to work. If the Domain report is empty, it could be that DNS lookups are turned off in both the server and Summary.
Enter/Entry Point - The first non-graphic requested as part of a visit. This is the point where a user enters your site.
Enters - Shows the number of visits whose first non-graphic request (where they started) was for that page.
Error - A request which resulted in an error code being sent to the browser. The most common error is 404 - File Not Found. Any result code 400 or higher is treated as an error.
Exit Point - The last non-graphic requested as part of a visit.
File Type - The file name extension in a request is taken to indicate the file type in Summary.
Graphic - A request for a file containing an image. Summary uses the file name extension to determine if a request is for a graphic. The set of extensions used to make this determination can be configured. Requests for graphics are not counted as steps in a visit.
Hit - A single request is often called a "hit" on the web site. Saying there were "56 hits" on an item means that there were 56 separate requests for that item. The item may be a specific file, a particular referrer, or some other use of a resource by a single request.
Host - A computer is often referred to as a host when talking about networks. Each computer is assigned a unique IP address. There are some exceptions, where several computers will share a single IP address, or one computer can have several IP addresses. In Summary, each unique IP address is referred to as a host.
Local Referrer - A referrer is local to a site if it is in the same domain as the associated request, or in a domain which is equivalent to that domain. A referrer with the "Preferred domain name" or any of the "Other local domain names" is counted as local.
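A minimal Python sketch of this test is shown below. The function name is_local_referrer and its parameters are hypothetical; they simply stand in for the "Preferred domain name" and "Other local domain names" settings, and the matching rule (exact host or any host ending in the domain) is an assumption.

```python
from urllib.parse import urlsplit

def is_local_referrer(referrer, preferred_domain, other_local_domains=()):
    """True if the referrer's host falls in one of the local domains."""
    host = urlsplit(referrer).hostname or ""
    local_domains = [preferred_domain, *other_local_domains]
    return any(
        host == domain.lower() or host.endswith("." + domain.lower())
        for domain in local_domains
    )
```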
Method - Each request must contain a method. The most common method is "GET", which means simply get the requested item. A "HEAD" request means to get information about the item, such as size and last date modified. A browser will often keep copies of items in its cache and then use a "HEAD" method to check if the item has been modified since it was put in the cache.
Others - Any request which is not for a page, graphic, or download.
Page - A request for a web page. Summary uses the file name extension to determine if a request is for a page. The set of extensions used to make this determination can be configured.
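The Page, Graphic, Download, and Others entries all rest on the same extension test, which might be sketched in Python as follows. The extension sets are placeholders chosen for this example; Summary's configured defaults are not listed in this glossary, and the function name classify_request is hypothetical.

```python
import posixpath

# Placeholder extension sets; the real sets are configurable in Summary.
GRAPHIC_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}
DOWNLOAD_EXTENSIONS = {".zip", ".exe", ".sit", ".hqx"}
PAGE_EXTENSIONS = {".html", ".htm"}

def classify_request(url: str) -> str:
    """Classify a request as graphic, download, page, or other."""
    path = url.partition("?")[0]                      # drop CGI arguments
    extension = posixpath.splitext(path)[1].lower()   # e.g. ".gif"
    if extension in GRAPHIC_EXTENSIONS:
        return "graphic"
    if extension in DOWNLOAD_EXTENSIONS:
        return "download"
    if extension in PAGE_EXTENSIONS:
        return "page"
    return "other"
```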
Path - A sequence of requests for non-graphics in a single visit. Summary only keeps track of the first three requests, the last request, and whether there were more than four requests in the path or not.
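What is retained for a path can be sketched in Python as below. The function name summarize_path and the dictionary keys are assumptions for this example; the input is taken to be the ordered list of non-graphic requests in one visit.

```python
def summarize_path(requests):
    """Keep the first three requests, the last one, and a length flag."""
    return {
        "first_three": requests[:3],
        "last": requests[-1] if requests else None,
        "more_than_four": len(requests) > 4,
    }
```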
Platform - The name of the operating system and/or hardware used to make the request. This is derived from the agent string and suffers some of the same "lying" issues the agent string does. Summary decodes the most common platforms based on internal rules which work with the vast majority of requests.
Referrer - The web browser generally provides the most recent previous URL when making a request; this is called the referrer. There are two major kinds of referrers. A page that contains graphics will appear as the referrer for the requests for the graphics. When a user clicks on a link that points to your site, the URL of the page containing the link is sent as the referrer.
The referrer information is not always put into the log file. NCSA Combined logs contain it. WebSTAR logs referrer information when the "REFERER" or "CS(REFERER)" tokens are included.
Reload - A request for an item followed by another request for the same item, with no other requests in between in the same visit. Reloads can be caused by the user hitting the reload button, by subsequent attempts to complete a failed download, or by a cache satisfying the requests that would otherwise have come in between.
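Counting reloads in a visit amounts to counting consecutive repeats, as in this Python sketch. The function name count_reloads is hypothetical, and the input is assumed to be the ordered list of requested items in a single visit.

```python
def count_reloads(requests):
    """Count requests that repeat the immediately preceding request."""
    return sum(
        1
        for previous, current in zip(requests, requests[1:])
        if previous == current
    )
```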
Request - When you type a URL into a web browser, it sends a request for the item named by that URL to the server. Request can mean the entire request or specifically the name of the item contained in the request.
Search Phrase - Summary attempts to extract the search string that the user typed into any of the major search engines to find your site. This information is extracted from the referrer. The entire string is called the search phrase.
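The extraction can be sketched in Python along the following lines. This is only an assumption about the general approach: the parameter names in SEARCH_PARAMETERS are placeholders, the function name extract_search_phrase is hypothetical, and a real implementation would presumably also check which search engine the referrer belongs to.

```python
from urllib.parse import urlsplit, parse_qs

# Placeholder query parameter names that may carry the typed search string.
SEARCH_PARAMETERS = ("q", "query", "p")

def extract_search_phrase(referrer):
    """Return the search phrase found in a referrer URL, or None."""
    arguments = parse_qs(urlsplit(referrer).query)
    for name in SEARCH_PARAMETERS:
        if name in arguments:
            # parse_qs decodes "+" and %xx escapes back into plain text.
            return arguments[name][0]
    return None
```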
Source - The most recent non-graphic request before the current one in a visit.
Steps - Each non-graphic request in a visit is counted as one step. The first request is step one, the second is step two, and so on. Steps are normally displayed as the average of many step numbers for the same item from different visits.
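The averaging of step numbers can be sketched in Python as follows. The function name average_steps is hypothetical; each visit is assumed to be an ordered list of its non-graphic requests.

```python
from collections import defaultdict

def average_steps(visits):
    """Average the step number at which each item appears across visits."""
    step_numbers = defaultdict(list)
    for visit in visits:
        # The first request in a visit is step one, the second step two, ...
        for step, item in enumerate(visit, start=1):
            step_numbers[item].append(step)
    return {item: sum(steps) / len(steps) for item, steps in step_numbers.items()}
```

For example, an item requested as step 2 in one visit and step 1 in another would be displayed with an average of 1.5 steps.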
Top Level Domain - The last component of a domain name. For example, the domain "summary.net" has a top level domain of "net". There are many two-letter "country code" top level domains, and only a few longer ones. There is currently a movement to increase the number of longer, non-country domains.
Unique Hosts - The number of distinct IP addresses and host names making requests. This may be used as a rough estimate of the number of distinct people accessing your site, even though it does not exactly correspond to people. There are two major reasons why this number does not directly count people, and some other minor ones. Some accesses are made through proxy servers or NAT gateways, machines that have a single IP address but may be in use by multiple people. AOL and some of the other large service providers always route requests through proxy servers. Dial-up connections usually have a different IP address each time you dial up, so a single person accessing your site over the course of several different dial-up sessions will have several different IP addresses.
Virtual Domain - When you are supporting multiple domains on a single server, each domain being served is often referred to as a virtual domain. Different server software implements or defines virtual domains in different ways. The strict definition of virtual domain is when a single IP address is shared between multiple domains. Summary uses the term virtual domain in any situation involving more than one domain. See also Virtual Server.
Virtual Server - When you are supporting multiple domains with a single server program, each domain is said to have a virtual server. Summary uses virtual server to refer to a domain or server name in any situation where more than one domain or server is involved. Summary takes the name of the server from the domain name that the user typed as part of the request, the IP address which received the request, the configured server name, or the name of the sub-folder in the Logs folder, and calls that the virtual server name. In some cases these will refer to actual (as opposed to virtual) servers. See also Virtual Domain.
Visit - A sequence of requests all made from the same IP address, with no gap between requests exceeding a time limit (normally 30 minutes). The time limit is configurable. This normally represents a single person moving through your web site, but there can be exceptions. A proxy machine used by several people could result in several different people accessing the site from the same IP address within the time limit. It is also possible for a single person to make different requests to your site from multiple IP addresses at the same time. Both of these exceptions are usually rare, generally accounting for a small portion of all visits. Very high traffic sites tend to experience these exceptions more often.
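Grouping requests into visits by IP address and time gap can be sketched in Python as follows. The function name group_visits and the (ip, timestamp, url) tuple layout are assumptions for this example, not Summary's internal representation.

```python
from datetime import datetime, timedelta

def group_visits(requests, time_limit=timedelta(minutes=30)):
    """Group (ip, timestamp, url) tuples into visits.

    A new visit starts when an IP address is seen for the first time or
    when the gap since its previous request exceeds time_limit.
    """
    visits = []
    open_visit = {}  # ip -> the visit list currently open for that address
    for ip, when, url in sorted(requests, key=lambda request: request[1]):
        visit = open_visit.get(ip)
        if visit is None or when - visit[-1][1] > time_limit:
            visit = []
            visits.append(visit)
            open_visit[ip] = visit
        visit.append((ip, when, url))
    return visits
```

As the entry notes, a sketch like this cannot tell several people behind one proxy apart; they would be merged into a single visit.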
Web Robot - A program making a request that is not in direct response to a person making a request of that program is thought of as a Web Robot. Web robots are used for several purposes, such as search engine indexing robots, link checkers, e-mail address extractors, and update watchers. Summary has an internal database of common known Web Robots, determined from the agent string. Any host making a request for '/robots.txt' is counted as a possible web robot. The '/robots.txt' file is frequently used by robots to know which portions of your site should be avoided by robots.
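The '/robots.txt' rule can be sketched in Python as follows. The function name possible_robot_hosts and the (host, url) pair layout are assumptions for this example; the agent-string database mentioned above is not modeled.

```python
def possible_robot_hosts(requests):
    """Return the set of hosts that requested '/robots.txt'.

    requests is an iterable of (host, url) pairs.
    """
    return {
        host
        for host, url in requests
        if url.partition("?")[0] == "/robots.txt"
    }
```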